52 research outputs found

    A cDNA Microarray Gene Expression Data Classifier for Clinical Diagnostics Based on Graph Theory

    Despite great advances in discovering cancer molecular profiles, the proper application of microarray technology to routine clinical diagnostics is still a challenge. Current practices in the classification of microarray data show two main limitations: the reliability of the training data sets used to build the classifiers, and the classifiers' performance, especially when the sample to be classified does not belong to any of the available classes. In this case, state-of-the-art algorithms usually produce a high rate of false positives, which is unacceptable in real diagnostic applications. To address this problem, this paper presents a new cDNA microarray data classification algorithm that is based on graph theory and is able to overcome most of the limitations of known classification methodologies. The classifier works by analyzing gene expression data organized in an innovative data structure based on graphs, where vertices correspond to genes and edges to gene expression relationships. To demonstrate the novelty of the proposed approach, the authors present an experimental performance comparison between the proposed classifier and several state-of-the-art classification algorithms.

    Gene expression classifiers and out-of-class samples detection

    The proper application of statistics, machine learning, and data-mining techniques in routine clinical diagnostics to classify diseases using their gene expression profiles is still a challenge. One critical issue is the overall inability of most state-of-the-art classifiers to identify out-of-class samples, i.e., samples that do not belong to any of the available classes. This paper offers a possible explanation for this problem and suggests how, by analyzing the distribution of the class probability estimates generated by a classifier, it is possible to build decision rules that significantly improve its performance.
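
    The idea of a decision rule built on class probability estimates can be illustrated with a minimal sketch. It is not the rule proposed in the paper: the function name `classify_with_rejection` and the thresholds `min_top` and `min_margin` are hypothetical, and the rule simply rejects a sample as out-of-class when the top estimate is low or too close to the runner-up.

```python
from typing import Optional, Sequence


def classify_with_rejection(
    probs: Sequence[float],
    min_top: float = 0.7,      # hypothetical threshold on the best estimate
    min_margin: float = 0.2,   # hypothetical gap required over the runner-up
) -> Optional[int]:
    """Return the index of the predicted class, or None for out-of-class."""
    if not probs:
        raise ValueError("probs must contain at least one class estimate")

    # Rank class indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    top = probs[ranked[0]]
    runner_up = probs[ranked[1]] if len(ranked) > 1 else 0.0

    # A flat or ambiguous distribution is treated as "none of the classes".
    if top < min_top or (top - runner_up) < min_margin:
        return None
    return ranked[0]
```

    With these assumed thresholds, a peaked distribution such as `[0.9, 0.05, 0.05]` is assigned to class 0, while a flat one such as `[0.4, 0.35, 0.25]` is rejected.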

    Gene expression reliability estimation through cluster-based analysis

    Gene expression fundamentally controls the structure and function of cells and underlies the versatility and adaptability of any organism. Gene expression is measured on images generated by optical inspection of microarray devices, which allow the simultaneous analysis of thousands of genes. The images produced by these devices are used to calculate the expression levels of mRNA in order to draw diagnostic information related to human disease. Quality measures are mandatory in gene classification and in diagnostic decision-making. However, microarrays are characterized by imperfections due to sample contamination, scratches, precipitation, or imperfect gridding and spot detection. Automatic and efficient quality measurement of microarrays is needed in order to identify faulty gene expression levels. In this paper we present a new method for estimating the quality and the data reliability of a microarray analysis. The efficiency of the proposed approach in terms of gene expression classification has been demonstrated through a supervised clustering analysis performed on a set of three different histological samples related to lymphoma.

    GPU acceleration for statistical gene classification

    The use of bioinformatics tools in routine clinical diagnostics still faces a number of issues. The more complex and advanced bioinformatics tools become, the more performance is required of the computing platforms. Unfortunately, the cost of parallel computing platforms is usually prohibitive for both public and small private medical practices. This paper presents a successful experience in using the parallel processing capabilities of Graphics Processing Units (GPUs) to speed up bioinformatics tasks such as statistical classification of gene expression profiles. The results show that using open source CUDA programming libraries yields a significant increase in performance and therefore narrows the gap between advanced bioinformatics tools and real medical practice.

    Differential gene expression graphs: A data structure for classification in DNA microarrays

    This paper proposes an innovative data structure to be used as a backbone in designing microarray phenotype sample classifiers. The data structure is based on graphs and is built from a differential analysis of the expression levels of healthy and diseased tissue samples in a microarray dataset. By construction, the proposed data structure has a number of properties that make it well suited to problems such as feature extraction, clustering, and classification.

    A graph-based representation of Gene Expression profiles in DNA microarrays

    This paper proposes a new and very flexible data model, called gene expression graph (GEG), for gene expression analysis and classification. Three features differentiate GEGs from other available microarray data representation structures: (i) the memory occupation of a GEG is independent of the number of samples used to build it; (ii) a GEG more clearly expresses relationships among expressed and non-expressed genes in both healthy and diseased tissue experiments; (iii) GEGs make it easy to implement very efficient classifiers. The paper also presents a simple classifier for sample-based classification to show the flexibility and user-friendliness of the proposed data structure.
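
    A rough sketch can show how a graph over genes keeps its memory footprint independent of the number of samples. This is an illustration under stated assumptions, not the GEG defined in the paper: the class name `ExpressionGraph`, the co-expression edge counters, and the `score` function are all hypothetical. Each sample only increments counters on vertices (genes) and edges (gene pairs), so storage is bounded by the number of genes rather than the number of samples.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Tuple


class ExpressionGraph:
    """Hypothetical graph: vertices are genes, edges count co-expression."""

    def __init__(self) -> None:
        self.vertex_count: Dict[str, int] = defaultdict(int)
        self.edge_weight: Dict[Tuple[str, str], int] = defaultdict(int)
        self.samples = 0

    def add_sample(self, expressed_genes: Iterable[str]) -> None:
        """Fold one sample into the counters; the sample is not stored."""
        genes = sorted(set(expressed_genes))
        self.samples += 1
        for gene in genes:
            self.vertex_count[gene] += 1
        # Sorted order makes every edge key canonical: (a, b) with a < b.
        for a, b in combinations(genes, 2):
            self.edge_weight[(a, b)] += 1

    def score(self, expressed_genes: Iterable[str]) -> float:
        """Average fraction of training samples sharing each gene pair."""
        pairs = list(combinations(sorted(set(expressed_genes)), 2))
        if not pairs or self.samples == 0:
            return 0.0
        total = sum(self.edge_weight.get(pair, 0) for pair in pairs)
        return total / (len(pairs) * self.samples)


graph = ExpressionGraph()
graph.add_sample(["A", "B", "C"])
graph.add_sample(["A", "B"])
```

    Under these assumptions, a sample-based classifier could build one such graph per class and assign a new sample to the class whose graph gives it the highest score.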

    Gene Expression vs. Network Attractors

    Microarrays, RNA-Seq, and Gene Regulatory Networks (GRNs) are common tools used to study the regulatory mechanisms mediating the expression of the genes involved in the biological processes of a cell. Whereas microarrays and RNA-Seq provide a snapshot of the average expression of a set of genes in a population of cells, GRNs are used to model the dynamics of the regulatory dependencies among a subset of genes believed to be the main actors in a biological process. In this paper we discuss the possibility of correlating the dynamics of a GRN with a gene expression profile extracted from one or more wet-lab expression experiments. This is more a position paper to promote discussion than a research paper with final results.

    GPU cards as a low cost solution for efficient and fast classification of high dimensional gene expression datasets

    The day when bioinformatics tools will be reliable enough to become a standard aid in routine clinical diagnostics is getting very close. However, it is important to remember that the more complex and advanced bioinformatics tools become, the more performance is required of the computing platforms. Unfortunately, the cost of High Performance Computing (HPC) platforms is still prohibitive for both public and private medical practices. Therefore, to promote and facilitate the use of bioinformatics tools, it is important to identify low-cost parallel computing solutions. This paper presents a successful experience in using the parallel processing capabilities of Graphics Processing Units (GPUs) to speed up classification of gene expression profiles. Results show that using open source CUDA programming libraries yields a significant increase in performance and therefore narrows the gap between advanced bioinformatics tools and real medical practice.

    An agent-based simulation framework for complex systems

    In this abstract we present a new approach to the simulation of complex systems such as biological interaction networks, chemical reactions, ecosystems, etc. It aims at overcoming previously proposed analytical approaches that, because of several computational challenges, could not handle systems of realistic complexity. The proposed model is based on a set of agents interacting through a shared environment. Each agent functions independently from the others, and its behavior is driven only by its current status and the "content" of the surrounding environment. The environment is the only "data repository" and does not store the value of variables, but only their presence and concentration. Each agent performs three main functions: (1) it samples the environment at random locations; (2) based on the distribution of the sampled data and a proper Transfer Function, it computes the rate at which the output values are generated; (3) it writes the output "products" at random locations. The environment is modeled as a Really Random Access Memory (R2AM). Data is written and sampled at random memory locations. Each memory location represents an atomic sample (a molecule, a chemical compound, a protein, an ion, ...). Presence and concentration of these samples are what constitutes the environment data set. The environment can be sensitive to external stimuli (e.g., pH, temperature, ...) and can include topological information to allow its partitioning (e.g., between nucleus and cytoplasm in a cell) and the modeling of sample "movements" within the environment. The proposed approach is easily scalable in both complexity and computational costs. Each module could implement a very simple object, such as a single chemical reaction, or a very complex process, such as the translation of a gene into a protein. At the same time, from the hardware point of view, the complexity of the objects implementing a single agent can range from a single software process to a dedicated computer or hardware platform.
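
    The sample, compute, write cycle described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the framework's implementation: the class names `Environment` and `Agent`, the parameter `k`, and the linear transfer function are hypothetical. The environment is a flat memory whose cells hold at most one atomic sample, so only presence and concentration are observable.

```python
import random
from typing import Callable, List, Optional


class Environment:
    """Toy random-access memory: each cell holds one sample label or None."""

    def __init__(self, size: int, rng: random.Random) -> None:
        self.cells: List[Optional[str]] = [None] * size
        self.rng = rng

    def sample(self, k: int) -> List[Optional[str]]:
        """Read k cells chosen uniformly at random (with replacement)."""
        return [self.cells[self.rng.randrange(len(self.cells))] for _ in range(k)]

    def write(self, species: str) -> None:
        """Write one sample to a random cell, overwriting its content."""
        self.cells[self.rng.randrange(len(self.cells))] = species

    def concentration(self, species: str) -> float:
        return self.cells.count(species) / len(self.cells)


class Agent:
    """Hypothetical agent turning observed input into output products."""

    def __init__(self, consumes: str, produces: str,
                 transfer: Callable[[float], float], k: int = 20) -> None:
        self.consumes, self.produces = consumes, produces
        self.transfer, self.k = transfer, k

    def step(self, env: Environment) -> int:
        observed = env.sample(self.k)                       # 1. sample
        fraction = observed.count(self.consumes) / self.k
        n_out = max(0, round(self.transfer(fraction)))      # 2. output rate
        for _ in range(n_out):                              # 3. write products
            env.write(self.produces)
        return n_out


rng = random.Random(0)
env = Environment(size=100, rng=rng)
env.cells = ["A"] * 100                     # environment saturated with "A"
agent = Agent("A", "B", transfer=lambda f: 5 * f)
produced = agent.step(env)
```

    Because the agent only sees random samples, its output rate tracks the concentration of its input without any shared variables, which is what lets each agent run as an independent process in this sketch.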